SUMMARY

This  paper  describes  WordNet,  an  on-line  lexical  reference  system  whose  design  is 
based  on  psycholinguistic  theories  of  human  lexical  organization  and  memory.  English 
nouns,  verbs,  and  adjectives  are  organized  into  synonym  sets,  each  representing  one 
underlying  lexical  concept.  Synonym  sets  are  then  related  via  three  principal  conceptual 
relations:  hyponymy,  meronymy,  and  antonymy.  Verbs  are  additionally  specified  for 
presupposition relations that hold among them, and for their most common
semantic/syntactic frames. By attempting to mirror the organization of the mental
lexicon, WordNet strives to serve the linguistically unsophisticated user.

George A. Miller, Christiane Fellbaum, Judy Kegl, and Katherine Miller

Cognitive  Science  Laboratory 
Princeton  University 

Introduction 

WordNet is an electronic lexical reference system for English that is being developed
in the form of an electronic database. Its design derives from psychological and
linguistic theories about how lexical information is organized and stored in the
memories of people who know English well and speak it fluently. The success of this
experimental system would demonstrate the adequacy of the theories from which it
derives, but even if those theories must be revised or replaced, the lexical database
that is being developed in order to test them will be adaptable to a variety of
practical applications. WordNet, supplemented on-line by a machine-readable dictionary
and made available via a multi-window workstation, can be profitably incorporated into
any task that is facilitated by easy access to lexical information.

Word  knowledge  is  analyzed  into:  (1)  the  sound  pattern,  (2)  the  concept  that  the 
sound pattern can express, and (3) the association of sound and concept. Sounds and
concepts are learned differently; as a consequence, different kinds of lexical relations are
established:  (1)  phonological  (e.g.,  rhyme)  and  morphological  relations  (e.g.,  inflection, 
derivation,  compounding)  are  word-specific,  whereas  (2)  semantic  relations  (e.g., 
synonymy,  subordination,  part-whole)  are  truth-functional. 

Both  kinds  of  relations  are  incorporated  in  WordNet.  A  concept  is  represented  by  a 
set  of  synonyms  that  can  be  used,  in  appropriate  contexts,  to  express  it;  other  semantic 
relations  are  represented  by  labeled  pointers  between  the  related  concepts.  WordNet  will 
test  the  adequacy  of  current  ideas  about  the  structure  of  the  lexicon  by  testing  whether  a 
realistically  large  sample  of  the  English  lexicon  can  be  represented  in  this  way. 

The use of synonym sets is both an innovative and an expedient approach to dictionary
design. Standard dictionaries develop uniform semantic representations for all the
lexical items in English by systematizing the writing of sense definitions or by
determining a set of linguistic primitives that constitute the meaning of lexical items.
WordNet circumvents the writing and systematizing of sense definitions by representing
concepts as relations among words arranged in a "vocabulary matrix," a giant network
coding various relations by means of connections between words. It simply looks along
a given row of the vocabulary matrix, notes all the words that can be used to express
the same concept, and then substitutes that synonym set for the statement of the
concept. If one
accesses  the  dictionary  by  way  of  the  horizontal  word  list,  one  gets  a  view  of  the 
polysemy  of  a  word  (all  the  different  concepts  that  the  word  can  be  associated  with).  On 
the  other  hand,  if  one  accesses  the  matrix  from  the  vertical  concept  list,  one  gets  a  row 
containing  all  the  different  synonymous  words  that  express  a  given  concept. 
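
The two access routes just described can be illustrated with a minimal Python sketch. The concept labels and the handful of words below are invented for illustration and are not taken from the WordNet files; the sketch assumes only that the matrix is a set of (concept, word) cells.

```python
# Hypothetical miniature vocabulary matrix: each cell pairs a concept (row)
# with a word (column) that can express it. Labels are illustrative only.
CELLS = {
    ("C1: hollow cylinder", "pipe"),
    ("C1: hollow cylinder", "tube"),
    ("C2: smoking device", "pipe"),
    ("C3: underground railway", "tube"),
    ("C3: underground railway", "subway"),
}

def synonyms(concept):
    """Read along a row: all words that can express one concept."""
    return sorted(word for c, word in CELLS if c == concept)

def polysemy(word):
    """Read down a column: all concepts that one word can express."""
    return sorted(c for c, w in CELLS if w == word)

print(synonyms("C1: hollow cylinder"))  # ['pipe', 'tube']
print(polysemy("pipe"))  # ['C1: hollow cylinder', 'C2: smoking device']
```

Entering by concept returns a synonym set; entering by word returns the senses of that word.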

Once  the  basic  matrix  is  in  place,  an  elaborate  system  of  cross-referencing  allows 
the  coding  of  various  relations  between  synonym  sets,  including  relations  of  antonymy, 
superordination,  subordination,  part-whole,  grading,  and  presupposition.  Finally,  more 
complex  relations  termed  "theories"  can  be  encoded,  including  topics,  semantic  fields, 
and areas of discourse. WordNet is free from any requirement to encode all the information
about a word in the confines of a single entry. Furthermore, the nonlinear nature of
this  net  together  with  the  freedom  afforded  by  computer  access  captures  many  important 
relations  obscured  by  the  formatting  constraints  of  hand-held  dictionaries. 

Psycholinguistic  Issues 

What  a  language  user  must  know  and  how  that  knowledge  is  organized  are  related 
but  separable  questions.  In  order  to  speak  and  understand  any  language,  it  is  necessary  to 
know  the  sounds  and  meanings  of  thousands  of  different  lexical  units — some  idea  of 
what a language user must know can be gathered from reading an ordinary desk dictionary.

How that lexical knowledge is organized, however, is a much more difficult question.
In a printed dictionary it is organized alphabetically. In a person’s memory the
organization  is  much  more  complex.  Lexical  memory  must  be  so  organized  that  the 
sounds  and  the  contextually  appropriate  meanings  of  thousands  of  different  words  can  be 
retrieved  from  memory  at  rapid  rates.  The  conversational  use  of  language  would 
scarcely  be  possible  unless  the  lexical  memory  system  were  well  organized  to  support 
such  rapid  retrieval.  The  nature  of  this  organization  and  how  it  comes  to  be  constructed 
during  the  process  of  learning  a  language  are  basic  questions  for  psycholinguistic 
research. Questions about the organization of lexical memory are easier to consider,
however, if one first becomes clear about what a language user must know.

A  vocabulary  matrix  is  sufficiently  general  to  represent  any  lexicon,  whether  it 
exists in a person, in a book, or in a computer. It contains a representation of the
phonological form of a word and a representation of the conceptual content of the word, along
with  the  associative  bond  connecting  them.  The  vocabulary  matrix  is  not  a  complete 
model  of  a  human  language  user’s  lexical  knowledge,  however.  A  good  model  of  a 
person’s  lexical  knowledge  would  have  to  include  the  phonological  and  morphological 
features  of  the  words  and  the  semantic  and  pragmatic  relations  among  lexical  concepts. 

Lexical  Relations 

The  vocabulary  matrix  captures  the  basic  structure  of  lexical  memory,  but  it 
neglects  the  complex  relations  that  exist  between  words. 

Phonological relations like rhyme, and morphological relations between derivatives
(e.g., navy and naval, or high, higher, and highest) or within compounds (e.g., ship,
board, and shipboard, or pocket, pick, and pickpocket), are real and recognizable to
anyone who knows English but are not shown in the vocabulary matrix. Judgments of such
relations  between  words  depend  on  familiarity  with  the  spoken  patterns;  they  are  rapid 
and  accurate  for  highly  practiced  words  but  slow  and  unreliable  for  infrequently  used  and 
unfamiliar  words. 

Conceptual  relations  are  not  shown  in  the  vocabulary  matrix,  either.  A  wide  variety 
of  such  relations  have  been  studied  by  psycholinguists  (Chaffin  &  Herrmann,  1984).  For 
example,  subordination  and  superordination  (e.g.,  a  maple  is  a  tree,  and  a  tree  is  a  plant), 
which are relations between concepts, do not appear in a simple listing of lexical concepts.
Linguists and lexicographers refer to subordination as hyponymy. Hyponymy
generates  a  hierarchical  structure,  a  taxonomy,  in  the  lexicon. 

The part-whole relation is also a relation between concepts, not between words (Iris,
Litowitz, & Evens, 1985). Simple examples are easily found (for example, a car has an
engine, an engine has a carburetor, and a carburetor has a flutter valve). Meronymy is
the term used to refer to the part-whole relation: flutter valve is a meronym of
carburetor and carburetor is a meronym of engine. Like hyponymy, meronymy exhibits a
hierarchical organization where, instead of the ISA relation, the HASA relation is
exploited.

No adequate theory of the organization of lexical memory can ignore the strong formal
relations between the columns or the strong semantic relations between the rows of
the  vocabulary  matrix.  Lexical  relations  must,  therefore,  be  included  in  any  electronic 
system  that  hopes  to  simulate  the  structure  of  human  memory.  The  vocabulary  matrix  is 
merely  a  skeleton;  it  must  be  fleshed  out  with  many  formal  and  conceptual  relations. 

Sources  of  Evidence 

Any theory must rest on a body of factual data. Two rather different kinds of factual
data are available to support claims about the organization of lexical memory. One
is linguistic: the data underlying theories of lexical organization are conveniently
summarized in printed dictionaries and thesauruses. The second is psychological: a variety
of  experimental  investigations  have  provided  evidence  for  the  psychological  reality  of 
the  hypothesized  mental  structures.  A  few  words  about  each  should  suffice  to  indicate 
the  general  character  of  the  available  data. 

First,  the  linguistic  evidence.  Dictionaries  and  thesauruses  that  summarize  the 
relevant  linguistic  information  derive  ultimately  from  the  recorded  use  of  the  language 
by  native  speakers  and  from  native  speakers’  subjective  judgments. 

How  words  are  strung  together  in  sentences  and  larger  units  of  discourse  provides 
necessary  information  for  a  person — child  or  foreigner — trying  to  learn  the  vocabulary  of 
a  language.  The  ability  to  produce  acceptable  sentences  is  an  important  indicator  that  the 
writer  or  speaker  knows  the  words  they  contain.  Lexicographers  collect  such  sentences, 
classify  them  according  to  the  words  they  contain,  and  cite  them  as  the  bases  for  the 
definitions  that  they  put  in  their  dictionaries.  When  dealing  with  a  dead  language,  the 
written  corpus  is  the  only  evidence  available.  However,  the  inconvenience  of  this  kind  of 
evidence is that there are many different words, many of them relatively rare, and
enormous quantities of text must sometimes be searched in order to turn up a mere handful of
examples of sentences using the word that is being studied.

In  addition  to  corpus-based  lexicography,  some  linguists  and  lexicographers  also 
rely  on  native  speaker  intuitions.  Since  a  native  speaker  is  competent  to  produce  and 
understand  an  indefinite  variety  of  sentences  containing  any  particular  word,  his  or  her 
implicit  knowledge  of  the  language  provides  a  basis  for  subjective  judgments  that  can  be 
used  as  primary  data. 

Both  linguists  and  psychologists  have  developed  methods  to  tap  into  the  linguistic 
intuitions  of  others.  For  example,  psychologists  sometimes  give  native  speakers  a  word 
and  ask  what  other  words  it  suggests,  or  they  may  constrain  the  person’s  associations  by 
specific  instructions,  such  as  "What  is  a  kind  of  plant?"  or  "List  all  the  trees  you  can  think 
of." Judgments that ISA or HASA relations hold take the form of judgments of the truth or
falsity of such statements as "A maple is a tree" or "A gasoline engine has a carburetor."
General world knowledge is involved in such judgments, of course. Linguists, on the
other hand, are more likely to frame questions in terms of sentences, such as "Do S1 and
S2 have the same meaning?" where S1 and S2 are identical sentences except for a pair of
words  whose  meanings  are  to  be  compared.  Or  they  may  ask  for  judgments  of  oddness, 
for  example,  "pines  and  other  maples"  sounds  odd,  "trees  and  other  maples"  sounds  odd, 
but  "pines  and  other  trees"  does  not. 

The experimental evidence gathered by psychologists is of a different nature. By
and large, psycholinguistic experiments presuppose the validity of the general structures
that linguists and lexicographers have identified and try instead to test hypotheses
concerning the way such structures arise or how they contribute to other cognitive processes.

For  example,  linguists  distinguish  between  open  class  and  closed  class  words.  Open 
class  words  are  nouns,  verbs,  adjectives,  and  most  adverbs;  the  language  has  a  great 
many  different  open  class  words,  and  new  ones  can  easily  be  added  to  the  vocabulary  as 
needed.  Closed  class  words  are  articles,  prepositions,  conjunctions,  and  some  adverbs; 
English  has  a  limited  number  of  them  (around  100);  they  provide  important  information 
about  the  syntactic  structures  of  sentences,  and  new  ones  are  difficult  to  add  to  the 
language.  Psychologists  have  adopted  this  distinction,  calling  open  class  items  content 
words and closed class items function words.

Psychologists have found a variety of behavioral data to be correlated with this
distinction. For example, hesitations in conversational speech tend to occur before content
words, not before function words (Goldman-Eisler, 1968). Or, to take a different
example, good readers tend to direct their gaze at content words and to skip over function
words.  Since  there  are  relatively  few  function  words  and  they  are  used  in  every  sentence, 
they  occur  much  more  frequently  than  do  content  words;  consequently,  psychologists 
translate  the  content/function  distinction  into  a  word-frequency  distinction.  It  is  the 
infrequent and unpredictable words that cause a speaker to hesitate, and the less
frequently a word is used, the more time a reader will spend looking at it (Carpenter & Just,
1983). 

This word-frequency effect is also found in other experiments. In the lexical decision
task, for example, readers are asked to decide as quickly as they can whether a particular
string of letters spells an English word (Whaley, 1978). It has been found that the
time required to say Yes to actual words decreases as the word’s frequency of occurrence
increases  (Gordon,  1985).  A  wealth  of  such  results  strongly  suggests  that  a  person’s 
access  to  lexical  information  in  memory  is  faster  and  easier  the  more  often  the  word  has 
been  encountered  previously. 

The  conceptual  dimension  of  lexical  memory  has  also  been  explored  experimentally 
by  psychologists.  One  of  the  landmark  studies  was  the  work  of  Collins  and  Quillian 
(1969).  They  reported  that  it  takes  people  longer  to  judge  the  truth  of  the  statement  A 
canary  is  an  animal  than  to  judge  A  canary  is  a  bird.  They  attributed  such  observations 
to  the  fact  that  bird  is  the  immediate  superordinate  of  canary,  whereas  animal  is  a  more 
remote  superordinate. 
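
The notion of immediate versus remote superordinates can be made concrete by counting ISA links. The sketch below is only an illustration under the assumption that the hierarchy is stored as one superordinate pointer per concept; the function name is hypothetical, and the code counts links without modeling verification time itself.

```python
# Hypothetical fragment of an ISA hierarchy, following the canary example.
ISA = {"canary": "bird", "bird": "animal"}

def isa_distance(concept, superordinate):
    """Count ISA links from concept up to superordinate, or return None."""
    steps = 0
    while concept != superordinate:
        if concept not in ISA:
            return None  # superordinate is not above concept
        concept = ISA[concept]
        steps += 1
    return steps

print(isa_distance("canary", "bird"))    # 1
print(isa_distance("canary", "animal"))  # 2
```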

Collins and Quillian’s paper stimulated extensive research into the organization of
semantic  memory.  That  work  need  not  be  summarized  here;  an  excellent  review  has 
been  written  by  Smith  (1978).  It  suffices  for  the  present  purpose  to  indicate  what  kinds 
of  experimental  evidence  are  available  to  support  the  claim  that  words  are  doubly  entered 
in  lexical  memory. 

Although  the  work  outlined  in  this  paper  is  not  basic  research  in  the  sense  that  the 
experimental  studies  just  mentioned  clearly  are,  it  can  nevertheless  contribute  to  the 
understanding  of  the  organization  of  lexical  memory.  The  contribution  should  follow 
from the inclusion of a sizeable fraction of the English lexicon, which can act as an
antidote against premature enthusiasm. Psychological experiments are almost necessarily
conducted with a small number of words and then assumed (often implicitly) to
generalize over the entire vocabulary. A failure to look for negative evidence can tempt one into
serious  mistakes. 

This  temptation  can  be  strong  when  lexical  properties  and  relations  are  at  issue. 
When  only  bits  and  pieces  of  lexical  data  have  been  examined,  a  theorist  may  begin  to 
see  patterns,  to  formulate  hypotheses,  and  to  search  for  examples  to  support  those 
hypotheses.  Moreover,  supporting  examples  are  usually  found:  it  is  easy  to  find  words 
that  will  fit  nicely  into  almost  any  pattern  a  reasonable  person  might  invent.  But  the  fact 
that supporting examples can be found does not really test the hypothesis. A list of
positive instances, even a long list, offers no assurance that there are no negative instances.
Therefore,  in  order  to  avoid  favoritism  (even  unconscious  favoritism)  for  words  that 
confirm  one’s  hypothesis,  it  is  advisable  to  test  hypotheses  against  a  large  collection  of 
words,  a  collection  assembled  in  ignorance  of  the  hypotheses  in  question. 

WordNet:  Implementation  of  a  Model  of  Lexical  Organization 

WordNet  is  an  electronic  lexical  reference  system  designed  in  accordance  with  the 
theories summarized above. The first step in creating WordNet was to invent an
electronic version of the vocabulary matrix.

Synonym  Sets 

A  major  problem  facing  anyone  who  would  construct  a  vocabulary  matrix  is  how  to 
represent  all  the  various  concepts  that  words  can  express. 


Lexicographers  represent  lexical  concepts  by  circumlocution.  That  is  to  say,  they 
use words to define words. Lexicographers take great pains to distinguish among
different senses that a given word can express, but they pay far less attention to establishing
a  common  phrasing  for  the  same  sense  when  it  appears  in  entries  for  different  words.  For 
example,  in  one  widely  used  dictionary  the  same  lexical  concept  is  phrased  as  "inferior  in 
quality  or  value"  in  the  definition  of  poor  and  as  "of  little  or  less  importance,  value,  or 
merit,"  in  the  definition  of  inferior.  If  WordNet  represented  the  lexical  concepts  in  the 
vocabulary  matrix  by  definitional  phrases  borrowed  from  a  conventional  dictionary, 
many,  perhaps  most,  synonymic  relations  would  be  overlooked. 

Some  standard  convention  for  expressing  word  senses  is  required.  At  first  glance  it 
might  seem  that  there  are  many  options  to  choose  among.  Many  different  notations  for 
lexical  concepts  have  been  proposed  (see,  for  example,  Anderson,  1976;  Cullingford, 
1986;  Jackendoff,  1983;  Katz,  1972;  Miller  &  Johnson-Laird,  1976;  Norman  & 
Rumelhart, 1975; Schank, 1972; Sowa, 1984; Talmy, 1985). It might be possible to
identify one best suited for the present purpose. But such notations, although easier to
standardize than the usual circumlocutions, have been worked out in detail for only small sets
of English words, usually for whatever words happen to have been used for
demonstration purposes.

How,  then,  should  the  list  of  lexical  concepts  be  constructed?  In  order  to  proceed 
with  WordNet,  we  have  used  synonym  sets  to  represent  lexical  concepts.  That  is  to  say, 
the  identifier  for  the  concept  on  any  given  row  of  the  vocabulary  matrix  is  given  by  the 
list  of  words  that  (in  appropriate  contexts)  can  be  used  to  express  that  concept.  Actually, 
since  the  synonym  sets  will  be  numbered,  each  concept  will  be  represented  in  the  system 
by  a  number,  but  displayed  to  the  user  as  a  set  of  words  having  a  shared  meaning. 

It should be noted that synonym sets, unlike dictionary entries, do not have
headwords. In a book of synonyms, for example, one entry might have pipe as the headword,
alphabetized under P with "tube" as its contents, and another entry might have tube as
the headword, alphabetized under T with "pipe" as its contents. In WordNet, the
synonym set {pipe, tube} stands as an elementary component, and neither word is
ahead of the other. This practice has the advantage of symmetry: if x is a synonym of y,
then y is a synonym of x.
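
A headword-free, numbered synonym set might be sketched as follows. The set numbers and members are invented for illustration, and the function names are hypothetical; the sketch assumes only that a concept is stored under a number and displayed as an unordered set of words.

```python
# Hypothetical numbered synonym sets; numbers and members are invented.
SYNSETS = {
    101: frozenset({"pipe", "tube"}),
    102: frozenset({"poor", "inferior"}),
}

def are_synonyms(x, y):
    """True if some synonym set contains both words; order is irrelevant."""
    return any(x in s and y in s for s in SYNSETS.values())

def display(number):
    """Show a concept to the user as its word set rather than its number."""
    # Alphabetical order is used only to make the printout stable.
    return "{" + ", ".join(sorted(SYNSETS[number])) + "}"

# Symmetry: the answer does not depend on which word is named first.
assert are_synonyms("pipe", "tube") == are_synonyms("tube", "pipe")
print(display(101))  # {pipe, tube}
```

Because membership in a set carries no order, no word in the set can act as a headword.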

Because synonymy is so central to the design of WordNet, it resembles the electronic
thesauruses that are now becoming available commercially (Raskin, 1987). WordNet
goes beyond those products, however, by incorporating conceptual relations other
than synonymy, as will be described.

The  Master  List 

Once a satisfactory list of synonym sets becomes available, it will be simple to
index it. That is to say, an alphabetical listing of all the words in all the synonym sets can
be  constructed  where  each  word  is  followed  by  the  numbers  of  all  the  synonym  sets  of 
which  it  is  a  member.  This  list,  which  we  have  been  referring  to  as  the  master  list,  can 
also  contain  information  that  is  word-specific  and  not  dependent  on  the  concepts  that  the 
word can be used to express. For example, the master list will include information
concerning the relative frequency of use of each word.
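
The indexing step can be sketched as an inversion of the synonym sets. Everything in the example is an assumption made for illustration: the set numbers, the frequency counts, and the field names are not those of the WordNet files.

```python
from collections import defaultdict

# Hypothetical synonym sets and frequency counts, invented for illustration.
SYNSETS = {
    101: {"pipe", "tube"},
    102: {"pipe"},
    103: {"tube", "subway"},
}
FREQUENCY = {"pipe": 20, "tube": 15, "subway": 7}

def build_master_list(synsets, frequency):
    """Map each word, alphabetically, to its synset numbers and frequency."""
    index = defaultdict(list)
    for number in sorted(synsets):
        for word in synsets[number]:
            index[word].append(number)
    return {
        word: {"synsets": index[word], "frequency": frequency.get(word)}
        for word in sorted(index)
    }

master = build_master_list(SYNSETS, FREQUENCY)
for word, entry in master.items():
    print(word, entry["synsets"], entry["frequency"])
```

Word-specific information such as frequency is attached to the word's entry, not to any one synonym set.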

Conceptual Relations

As of the end of 1987, the WordNet files included 11,500 different nouns organized
into over 7,000 synonym sets; approximately 6,000 different verbs organized into over
3,000 synonym sets; and 9,500 different adjectives organized into over 8,200 synonym
sets. That gave a total of over 27,000 different words organized into approximately
18,200 synonym sets. The next step was to introduce relations between lexical concepts:
not only semantic relations (Cruse, 1986; Evens et al., 1983; Lyons, 1977, ch. 9), but
others as well. While additional synonym sets continue to be added, we are now
introducing cross-references designed to represent conceptual relations.

Conceptual  relations  are  represented  in  WordNet  by  cross-references  between 
synonym  sets.  Each  synonym  set,  therefore,  will  be  followed  by  a  list  of  the  numbers  of 
other  synonym  sets  related  to  it  in  particular  ways. 

Hyponymy, for example, can be introduced in WordNet by appending to a given
synonym set one number that points to its superordinate term and other numbers that
point  to  its  hyponyms.  The  relation  of  meronymy  is  similar.  Since  meronymy  generates 
a  part-whole  hierarchy  that  is  structurally  similar  to  a  hyponymic  hierarchy,  it  can  be 
introduced  in  WordNet  in  a  similar  manner,  by  labeled  cross-references. 
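
Labeled cross-references of this kind might be sketched as follows. The class, the label names, and the set numbers are hypothetical and chosen for illustration; the only assumption taken from the text is that each synonym set carries lists of numbers of related sets, grouped by the kind of relation.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    number: int
    words: frozenset
    # Relation label -> numbers of related synsets (label names are assumed).
    pointers: dict = field(default_factory=dict)

def link(db, label, inverse, source, target):
    """Add a labeled cross-reference and the matching inverse reference."""
    db[source].pointers.setdefault(label, []).append(target)
    db[target].pointers.setdefault(inverse, []).append(source)

db = {
    1: Synset(1, frozenset({"plant"})),
    2: Synset(2, frozenset({"tree"})),
    3: Synset(3, frozenset({"maple"})),
    4: Synset(4, frozenset({"engine"})),
    5: Synset(5, frozenset({"carburetor"})),
}
link(db, "superordinate", "hyponym", 2, 1)  # tree ISA plant
link(db, "superordinate", "hyponym", 3, 2)  # maple ISA tree
link(db, "whole", "part", 5, 4)             # carburetor is part of engine

print(db[2].pointers)  # {'superordinate': [1], 'hyponym': [3]}
```

The same mechanism serves both hierarchies; only the labels differ.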

The  Hyponymic  Hierarchy 

Cognitive  psychologists  have  been  interested  in  lexical  hierarchies  at  least  since 
Collins  and  Quillian  (1969)  proposed  them  as  a  model  of  semantic  memory.  According 
to  the  theory,  concepts  are  nodes  linked  by  labeled  arcs.  Workers  in  artificial  intelligence 
had observed that a hierarchy of nodes linked by ISA relations is an efficient storage
system: since all of the properties attributed to a superordinate node are inherited by its
hyponyms,  those  properties  need  be  stored  only  once — they  need  not  be  stored  separately 
with  every  hyponym.  For  example,  when  you  are  told  that  Cuthbert  is  a  cat  you  know 
immediately  that  Cuthbert  purrs,  has  four  legs,  fur,  retractable  claws,  and  so  on.  It  is  not 
necessary  to  learn  each  property  separately. 
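
The storage economy of inheritance can be illustrated with a small sketch. The hierarchy and the assignment of properties to levels are invented for the example; the point is only that each property is stored once and recovered by walking up the ISA links.

```python
# Hypothetical ISA links and properties, invented for illustration.
ISA = {"cat": "mammal", "mammal": "animal"}
PROPERTIES = {
    "cat": {"purrs", "has four legs", "has retractable claws"},
    "mammal": {"has fur"},
    "animal": {"breathes"},
}

def inherited_properties(concept):
    """Collect properties from a concept and from all its superordinates."""
    found = set()
    while concept is not None:
        found |= PROPERTIES.get(concept, set())
        concept = ISA.get(concept)
    return found

# Learning "Cuthbert is a cat" adds one link, not a list of properties.
ISA["Cuthbert"] = "cat"
print(sorted(inherited_properties("Cuthbert")))
```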

During  the  past  quarter  century,  therefore,  the  hyponymic  hierarchy  for  nominal 
concepts  has  been  widely  exploited.  For  example,  the  psychologist  Keil  (1979)  called  it 
an  "ontological  tree"  and  used  it  to  organize  his  observations  of  vocabulary  growth  in 
young  children.  Other  workers  have  not  found  the  hierarchy  as  neat  and  tidy  as  Keil  did: 
the computer scientist Cullingford (1986) called it a "tangled ISA-hierarchy" (e.g., knife
is a hyponym of both utensil and weapon) and used it as the basic classification scheme
underlying his natural language processing system. Others have proposed other
variations. But even those who disagree about the details do agree on the general idea that
some  kind  of  semantic  hierarchy  is  required  in  order  to  represent  lexical  knowledge. 
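
A tangled hierarchy differs from a tree only in allowing more than one superordinate per concept. The following sketch assumes a list of superordinates for each concept; the node "artifact" is an invented common ancestor added to show that shared superordinates are visited once.

```python
# Hypothetical tangled fragment: a concept may have several superordinates.
SUPERORDINATES = {
    "knife": ["utensil", "weapon"],
    "utensil": ["artifact"],
    "weapon": ["artifact"],
}

def all_superordinates(concept):
    """Return every concept reachable by following ISA links upward."""
    seen = set()
    stack = list(SUPERORDINATES.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERORDINATES.get(current, []))
    return seen

print(sorted(all_superordinates("knife")))  # ['artifact', 'utensil', 'weapon']
```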

It is not difficult to construct demonstrations based on small fragments of the
hyponymic hierarchy, but constructing it for a broad sample of the English lexicon is a
formidable task. Much of the information required is contained in the defining phrases of
standard dictionaries, where a common form of definition is: "x is a y that P," where x is
a hyponym of y and P is a relative clause that distinguishes x from the other hyponyms of
y. For example, the Longman Dictionary of Contemporary English (Procter, 1978)
says that a TREE is "a type of tall PLANT with a wooden trunk and branches, that lives for
many  years,"  from  which  it  is  obvious  that  TREE  is  a  hyponym  of  PLANT. 
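
For definitions that mark the superordinate typographically, as the quoted one does with capitals, a toy extraction can be sketched. It relies entirely on that capitalization, which is an assumption about how the definition text is encoded; an unmarked definition would require parsing the defining phrase. The function name is hypothetical.

```python
import re

def superordinate_from_definition(definition):
    """Return the first fully capitalized word in lowercase, or None."""
    match = re.search(r"\b[A-Z]{2,}\b", definition)
    return match.group(0).lower() if match else None

definition = ("a type of tall PLANT with a wooden trunk and branches, "
              "that lives for many years")
print(superordinate_from_definition(definition))  # plant
```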

This kind of information can be extracted from a machine-readable dictionary
(Amsler, 1980, 1981; Amsler & White, 1979; Chodorow, Byrd & Heidorn, 1985). The
results  make  it  clear  that  lexicographers  work  with  a  fundamentally  consistent  semantic 
hierarchy.  Unfortunately,  however,  definitions  in  standard  dictionaries  are  not  written 
with  this  analysis  in  mind,  and  fortuitous  variations  in  the  phraseology  of  related 
definitions  sometimes  obscure  their  relatedness. 

One feature of dictionaries that deserves comment is that it is much easier to
identify superordinates from the defining phrases than to identify hyponyms. For example,
the  definition  of  tree  will  almost  necessarily  say  that  a  tree  is  a  plant,  but  it  will  not  go  on 
to  say  that  apple,  elm,  fir,  maple,  pine,  etc.  are  all  trees;  for  that  information  a  user  must 
consult  the  individual  entries  for  apple,  elm,  fir,  etc.,  which  presupposes  that  users 
already  have  the  information  that  they  are  searching  for.  In  WordNet  moving  down  the 
hyponymic  hierarchy  should  be  as  easy  as  moving  up. 
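
Once superordinate pointers are stored electronically, the downward direction can be obtained by inverting them. The sketch below is one way this might be done, using the tree example; the dictionary layout and function name are assumptions.

```python
from collections import defaultdict

# Hypothetical superordinate pointers, as might be read from definitions.
SUPERORDINATE = {
    "apple": "tree", "elm": "tree", "fir": "tree",
    "maple": "tree", "pine": "tree", "tree": "plant",
}

def invert(superordinate):
    """Build hyponym lists by reversing every superordinate pointer."""
    hyponyms = defaultdict(list)
    for word, parent in superordinate.items():
        hyponyms[parent].append(word)
    return {parent: sorted(children) for parent, children in hyponyms.items()}

HYPONYMS = invert(SUPERORDINATE)
print(SUPERORDINATE["tree"])  # plant
print(HYPONYMS["tree"])       # ['apple', 'elm', 'fir', 'maple', 'pine']
```

With both tables available, a single lookup moves one step up or one step down.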

The hyponymic hierarchy is also apparent in standard thesauruses: Roget’s International
Thesaurus has 6 to 8 tiers of categories, going progressively from highly abstract
generic categories to highly concrete specific categories. However, Roget and his
successors were not slavishly devoted to the hyponymic relation, and careful judgment is
sometimes required in order to extract the hyponymic relation from all the other
information in an entry. For example, in Chapman’s (1977) version of Roget’s thesaurus the
path  from  the  root  of  the  hierarchy  out  to  one  sense  of  the  word  pipe  goes  as  follows: 

Class Two: Space
  III. Structure; Form
    B. Special Form
      255. Sphericity, Rotundity
        Nouns
          255.4 cylinder, cylindroid, cylindr(o)-
                pipe, tube

Although  one  can  agree  that  a  pipe  is  a  cylinder  and  that  a  cylinder  is  a  form,  the  rest  of 
this  path  introduces  other  kinds  of  information.  In  particular,  the  more  generic  concepts 
seem  rather  arbitrary.  Sedelow  and  Sedelow  (1986)  comment  that  there  is  much  greater 
descriptive  and  analytic  power,  semantically,  in  the  lower  tiers  of  Roget’s  thesaurus. 

In  most  cases,  the  judgments  required  to  settle  questions  about  hyponymic  relations 
are  not  difficult.  In  order  to  decide  whether  x  is  a  hyponym  of  y,  substitute  them  into  a 
standard  frame  of  the  form:  x  ISA  y,  then  judge,  on  the  basis  of  general  knowledge  about 
such  things,  whether  the  resulting  proposition  is  true  or  false.  If  it  is  true,  then  x  can  be 
accepted as a hyponym of y. Uncertainty about the truth value may complicate the
judgment when the judge is not knowledgeable about x’s and y’s, or when highly abstract
concepts  are  involved,  e.g.,  is  VIRTUE  a  hyponym  of  IDEA?  But  the  large  majority  of 
cases  are  easily  decided. 

By using a collection of dictionaries and thesauruses, liberally seasoned with
linguistic intuitions, WordNet editors have introduced hyponymic relations into the
synonym sets with relatively little trouble. In some cases, a word that seems to have no
obvious synonym can be tied into the semantic structure through its superordinate.
Blunderbuss, for example, has no good synonym in English, but it can be integrated into
WordNet  as  a  hyponym  of  firearm.  In  other  instances,  an  initial  synonym  set  can  be 
reorganized;  coordinate  terms — names  of  trees,  for  example — that  were  entered  initially 
as  a  synonym  set  could,  with  the  introduction  of  hyponymic  relations,  be  entered  more 
accurately  as  hyponyms — in  this  example,  as  hyponyms  of  tree.  In  general,  the  addition 
of  hyponymy  has  had  the  effect  of  sharpening  the  semantic  distinctions  that  can  be  drawn 
and,  as  a  consequence,  reducing  the  average  size  of  the  synonym  sets.  Considerable 
work  is  sometimes  required  to  reach  a  satisfactory  solution;  in  those  cases  care  has  been 
taken  not  to  impose  more  order  on  WordNet  than  a  literate  speaker  of  English  might  find 
reasonably  obvious. 
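
The ISA test and the use of superordinates described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the WordNet implementation: the table name SUPERORDINATE and the functions isa and coordinates are hypothetical, and the entries "weapon" and "plant" are assumed for the sake of the example rather than taken from the WordNet files.

```python
# Hypothetical toy data: each word points to its superordinate.
# blunderbuss/firearm and the tree names follow the text; "weapon" and
# "plant" are illustrative assumptions, not WordNet entries.
SUPERORDINATE = {
    "blunderbuss": "firearm",
    "firearm": "weapon",
    "oak": "tree",
    "elm": "tree",
    "tree": "plant",
}


def isa(x, y):
    """Return True if 'x ISA y' holds by following superordinate links."""
    current = SUPERORDINATE.get(x)
    while current is not None:
        if current == y:
            return True
        current = SUPERORDINATE.get(current)
    return False


def coordinates(x):
    """Return the coordinate terms of x: other words sharing its superordinate."""
    parent = SUPERORDINATE.get(x)
    if parent is None:
        return set()
    return {w for w, p in SUPERORDINATE.items() if p == parent and w != x}
```

Under these assumptions, isa("blunderbuss", "firearm") holds although blunderbuss has no synonym, and oak and elm appear as coordinate hyponyms of tree rather than as members of one synonym set.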

Antonymic  Clusters 

Psychologists  also  have  an  interest  in  antonymy,  since  antonyms  are  so  often  used 
to  anchor  the  ends  of  scales  used  in  subjective  judgments:  good-bad,  agree-disagree, 
right-wrong,  etc.  Probably  the  most  extensive  use  of  antonyms  for  scaling  purposes  was 
Osgood,  Suci,  and  Tannenbaum’s  (1957)  attempt  to  map  all  concepts  into  a  space  whose 
coordinates  were  given  by  pairs  of  antonymous  adjectives. 

Not  every  word  has  an  antonym,  of  course.  This  relation  is  probably  clearest 
between  adjectives,  although  it  is  by  no  means  limited  to  adjectives.  The  adjectival 
synonym  sets  were  chosen  as  the  most  appropriate  place  to  introduce  antonymy  into 
WordNet. 

The  work  began  with  the  assumption  that  antonymy  and  synonymy  are  themselves 
opposites.  That  is  to  say,  synonyms  are  words  whose  meanings  are  very  similar,  whereas 
antonyms  are  words  whose  meanings  are  very  dissimilar.  That  assumption  may  suffice 
as long as one does not look too closely, but careful analysis reveals important
differences. The long history of disagreement about the nature and definition of antonymy
(Egan, 1984) should have been a warning, but the extent of the difference was not
recognized until an attempt was made to represent antonymous pairs by symmetrical
cross-references between contrasting synonym sets.

The design of WordNet landed it, inadvertently, in the middle of a traditional argument
about antonymy. Is an antonym (1) any one of several words that can be opposed
to a group of synonymous terms, or is it (2) a single word, or at most one of two or three
words, that can be opposed to a given word? As originally conceived, WordNet
incorporated assumption (1). That is to say, relatively large groups of synonyms were first
compiled;  then  attempts  were  made  to  cross-reference  the  antonymous  sets.  But  it 
proved difficult to carry that program through. When synonym set C1 was put in
opposition to synonym set C2, not every word in C1 was an antonym of every word in C2, and
vice versa, and that fact made it difficult to judge whether the concepts represented by
the synonym sets were truly antonymous.

For example, the concept that is represented by the synonym set { damp, dank,
drenched, moist, soaked, waterlogged, wet } seems to be antonymous to the concept that
is represented by the synonym set { arid, baked, dehydrated, desiccated, dry, parched,
sere, withered }, but few people would think of withered as an antonym of waterlogged,
say, or of baked as an antonym of dank, etc. Assumption (1) defines antonymy as a
relation between lexical concepts, whereas assumption (2) defines antonymy as a relation
between  words.  Judgments  of  antonymy  are  much  easier  to  make  between  words  than 
between  concepts. 

The addition of antonymous relations sharpens considerably the semantic distinctions
that are required. That is to say, the adoption of assumption (2) necessarily limits
the number of words in many synonym sets to two or three. But the notion that antonymy
is a relation between words, rather than between concepts, finds support in the frequent
use of morphology to signal antonymy: perfect-imperfect, advantageous-disadvantageous,
benevolent-malevolent, powerful-powerless, superior-inferior,
definite-indefinite, etc. illustrate only a few of the ways in which derivational morphology
serves this purpose. Or, to put it differently, prefixing un- to adjectives can result in
new adjectives (pleasant-unpleasant) in much the same way that adding en- to adjectives
can result in causative verbs (rich-enrich). In both cases the affix does important semantic
work, but both dyads reflect formal relations between pairs of words. This is consistent
with assumption (2), which defines antonymy as a relation between words.

Moreover,  if  it  is  assumed  that  the  morphological  relations  involved  in  particular 
antonymous  pairs  must  be  learned  by  repeated  exposure  and  practice,  much  the  way  all 
formal  (i.e.,  phonological  and  morphological)  features  of  English  are  learned,  then  other 
observations  about  antonyms  could  be  explained.  For  example,  although  big-little  and 
large-small  are  both  antonymous  pairs,  it  sounds  odd  to  cross  them:  big-small  and  large- 
little.  The  explanation  is  that  we  have  heard  them  paired  one  way  much  more  frequently 
than the other. Although the cross is conceptually correct, it is morphologically
unfamiliar.

How  can  a  conceptual  definition  of  synonymy  coexist  with  a  formal  conception  of 
antonymy? Or, in more practical terms, how can a loose definition of synonymy be
combined with a strict definition of antonymy? Solving this practical problem forced an
interesting  structure  onto  the  adjective  file:  antonym  pairs  must  form  the  basic  skeleton 
of  adjectival  semantics,  and  this  skeleton  is  fleshed  out  by  those  adjectives  that  have  no 
obvious  antonym  but  are  similar  to  adjectives  that  do  have  antonyms.  That  is  to  say, 
another  relation,  dubbed  semantic  similarity,  is  introduced  to  preserve  sets  of  several 
synonyms,  but  without  precluding  the  one:one  pairing  of  antonyms. 

The  result  is  illustrated  in  Table  1  by  the  cluster  of  concepts  around  the  antonymous 
pair wet-dry. (The ‘a’ following each number indicates that it is the name for an
adjectival synonym set.) If dry in 1005a is consulted in search of an antonym, wet will be
found  in  1000a  (and  vice  versa),  whereas  if  dry  in  1015a  is  consulted,  the  antonym  in 
1070a  will  be  sweet.  On  the  other  hand,  if  1005a  is  consulted  for  near  synonyms  of  dry, 
all the words in 1006a, 1007a, 1008a, 1009a, and 1014a will be found. Thus, a narrow
interpretation of antonymy can coexist with a broad interpretation of synonymy. Moreover,
this form of representation poses no special problems for polysemous words: the
dry that is the antonym of wet expresses a different concept from the dry that is the
antonym of sweet, and different also from the dry that is similar to dull and uninteresting.

Table 1

The antonymic cluster, wet-dry
(Antonymic relation, *; similarity relation, &)

1000a { wet, &1001a, &1002a, &1003a, *1005a, }
1001a { damp, dank, moist, &1000a, }
1002a { drenched, saturated, soaked, waterlogged, &1000a, }
1003a { foggy, humid, misty, rainy, &1000a, }
1004a { drunk, slopped, tipsy, wet, *1080a, }
1005a { dry, *1000a, &1006a, &1007a, &1008a, &1009a, &1014a, }
1006a { arid, &1005a, }
1007a { dehydrated, desiccated, sere, withered, &1005a, }
1008a { baked, parched, &1005a, }
1009a { thirsty, &1005a, }
1010a { dry, impassive, matter-of-fact, unemotional, *1020a, }
1011a { barren, dry, sterile, unproductive, *1030a, }
1012a { boring, dry, insipid, wearisome, &1040a, &1090a, }
1013a { bare, dry, plain, unadorned, *1060a, }
1014a { anhydrous, &1005a, }
1015a { dry, &1110a, *1070a }
1020a { emotional, *1010a, }
1030a { fruitful, productive, *1011a, }
1040a { dull, &1012a, &1090a, }
1050a { interesting, *1090a, }
1060a { adorned, fancy, *1013a, }
1070a { sweet, *1015a, &1100a, }
1080a { dry, sober, *1004a }
1090a { uninteresting, &1012a, &1040a, *1050a, }
1100a { sugary, *1110a, &1070a, }
1110a { sugarless, &1015a, *1100a, }
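
The pointer structure of Table 1 can be approximated with a small Python sketch. It encodes only the physical-sense part of the wet-dry cluster; the dictionary layout and the function names (words_in, direct_antonyms, near_synonyms, indirect_antonyms) are assumptions made for illustration and do not describe the actual WordNet file format or software.

```python
# Physical-sense portion of Table 1. Each synonym set lists its words,
# its antonym pointers (*) and its similarity pointers (&).
SYNSETS = {
    "1000a": {"words": ["wet"], "ant": ["1005a"],
              "sim": ["1001a", "1002a", "1003a"]},
    "1001a": {"words": ["damp", "dank", "moist"], "ant": [], "sim": ["1000a"]},
    "1002a": {"words": ["drenched", "saturated", "soaked", "waterlogged"],
              "ant": [], "sim": ["1000a"]},
    "1003a": {"words": ["foggy", "humid", "misty", "rainy"],
              "ant": [], "sim": ["1000a"]},
    "1005a": {"words": ["dry"], "ant": ["1000a"],
              "sim": ["1006a", "1007a", "1008a", "1009a", "1014a"]},
    "1006a": {"words": ["arid"], "ant": [], "sim": ["1005a"]},
    "1007a": {"words": ["dehydrated", "desiccated", "sere", "withered"],
              "ant": [], "sim": ["1005a"]},
    "1008a": {"words": ["baked", "parched"], "ant": [], "sim": ["1005a"]},
    "1009a": {"words": ["thirsty"], "ant": [], "sim": ["1005a"]},
    "1014a": {"words": ["anhydrous"], "ant": [], "sim": ["1005a"]},
}


def words_in(set_ids):
    """Collect the words of the named synonym sets, in order."""
    return [w for sid in set_ids for w in SYNSETS[sid]["words"]]


def direct_antonyms(sid):
    """Words reached by following the antonym (*) pointers of one set."""
    return words_in(SYNSETS[sid]["ant"])


def near_synonyms(sid):
    """Words reached by following the similarity (&) pointers of one set."""
    return words_in(SYNSETS[sid]["sim"])


def indirect_antonyms(sid):
    """Antonyms reached by first following one similarity pointer."""
    found = []
    for neighbor in SYNSETS[sid]["sim"]:
        for word in direct_antonyms(neighbor):
            if word not in found:
                found.append(word)
    return found
```

In this sketch dank (1001a) has no direct antonym but reaches dry indirectly through wet, which mirrors the distinction between direct pairs like wet-dry and indirect pairs like dank-dry.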

Implicit  in  the  adoption  of  this  structure  for  WordNet  is  the  hypothesis  that  native 
speakers  of  English  have  a  similar  organization  of  their  lexical  memory  for  antonyms. 
That hypothesis can be tested, of course. It would not be difficult to design an
experiment that would determine whether native speakers of English can judge pairs like
wet-dry to be antonyms faster than they can judge indirect pairs like dank-dry, or doubly
indirect pairs like dank-parched. The possibility of conducting such experiments serves
to illustrate one way that the present work contributes to our understanding of the
organization of lexical memory. As future work incorporates meronymy, association,
and  verb  groups,  further  facets  of  the  organization  of  lexical  memory  may  become 
apparent. 

Meronymy 

Meronymy,  the  part-whole  relation,  is  another  basic  semantic  relation  between 
words and concepts. This relation turns out to play a prominent role in the noun
component of the lexicon and is widely exploited in WordNet. Winston, Chaffin and
Herrmann (1988; also Chaffin, Herrmann and Winston, 1987) studied a wide variety of
part-whole relations.

The most easily identifiable examples of meronymy are found among words denoting
concrete and countable entities. Body parts, for example, lend themselves well to
part-whole classification: a finger is a part of a hand, a hand is a part of an arm, and an
arm is a part of a body.

Another  kind  of  meronymy  is  represented  by  those  cases  where  the  concept  of  the 
whole  exists  only  by  virtue  of  the  existence  of  a  multiple  of  the  parts  and  is  conceptually 
and  linguistically  inseparable  from  them,  as  in  the  example  a  tree  is  a  part  of  a  forest. 
Thus,  one  can  say  a  forest  is  many  trees  but  not,  for  example,  a  body  is  many  arms. 

In  the  lexicon  of  nouns  referring  to  substances,  meronymy  again  takes  on  a  slightly 
different  meaning.  As  Lyons  (1977)  points  out,  gold  is  a  substance  and  it  can  also  be  a 
part  of  a  compound  matter.  Thus,  we  can  say  both  this  substance  is  gold  and  gold  is  part 
of  this  substance.  But  the  same  does  not  hold  for  arm:  Although  we  can  say  The  finger 
is  part  of  an  arm,  we  cannot  say  This  arm  is  a  finger. 

Meronymy overlaps with hyponymy in the case of collective nouns such as furniture:
While table is a kind of furniture, it is also part of furniture, in the sense that the
concept furniture can be said to prototypically include the concept table. The
classification of such collectives can, therefore, be problematic.

In the realm of concrete and count nouns, meronymy permits the establishment of
hierarchical structures in parallel with, but distinct from, hyponymic structures.
Meronymic relations, like hyponymic relations, are also transitive, in that we can say that if x
is a part of y, and y is a part of z, then x is also part of z. For example, foot is a
meronym of leg and leg is a meronym of body; therefore, foot is a meronym of body.
Bierwisch  (1965)  discusses  redundancies  in  these  meronymic  structures  and  asks  to  what 
extent  they  should  be  eliminated  by  rules.  It  would  be  interesting  to  test  whether  and 
how meronymic transitivity is represented in lexical memory: e.g., to see whether
subjects will easily associate two words that are distantly related by meronymy such as
doorknob  and  house,  and  if  such  associations  require  more  time  than  those  between  less 
distantly  related  words  like  door  and  house. 
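
Meronymic transitivity, and the notion of more and less distantly related parts, can be illustrated with a short sketch. The table PART_OF and the function meronymic_distance are hypothetical names; the part-whole pairs are taken from the examples in the text, and counting links is merely one assumed way of quantifying "distance" for the proposed experiment.

```python
from typing import Optional

# Direct part-whole links drawn from the examples in the text.
PART_OF = {
    "finger": "hand",
    "hand": "arm",
    "arm": "body",
    "foot": "leg",
    "leg": "body",
    "doorknob": "door",
    "door": "house",
}


def meronymic_distance(part: str, whole: str) -> Optional[int]:
    """Count part-of links from part up to whole; None if whole is not reached."""
    steps = 0
    current = part
    while current in PART_OF:
        current = PART_OF[current]
        steps += 1
        if current == whole:
            return steps
    return None
```

Here foot reaches body in two links, so the transitive relation holds, and doorknob is one link further from house than door is.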

Interesting  relations  exist  between  the  hyponymic  hierarchy  and  the  meronymic 
hierarchies.  For  example,  it  is  not  necessary  to  say  that  deck  is  a  meronym  of  warship  if 
it  has  already  been  said  that  deck  is  a  meronym  of  the  superordinate  ship.  Tversky  and 
Hemenway (1984) argue that the appropriate level in the hyponymic hierarchy for entering
part-whole relations is the level that has been called "basic" by anthropological
linguists (Berlin, Breedlove, & Raven, 1966; Rosch, Mervis, Gray, Johnson, &
Boyes-Braem, 1976).
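
The economy gained by entering a part once, at the superordinate, can be sketched as follows. The names HYPERNYM, PARTS, and parts_of are hypothetical, and the entries "vessel" and "gun turret" are assumptions added only to make the example non-trivial; only deck, ship, and warship come from the text.

```python
# Hyponymic links (word -> superordinate). "vessel" is an assumed entry.
HYPERNYM = {"warship": "ship", "ship": "vessel"}

# Parts entered directly at each level. "gun turret" is an assumed entry.
PARTS = {"ship": {"deck"}, "warship": {"gun turret"}}


def parts_of(word):
    """Collect parts entered for word and for each of its superordinates."""
    parts = set()
    current = word
    while current is not None:
        parts |= PARTS.get(current, set())
        current = HYPERNYM.get(current)
    return parts
```

With this arrangement deck is recorded only for ship, yet parts_of("warship") still returns it, because the lookup climbs the hyponymic hierarchy.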

Hyponymy,  antonymy,  and  meronymy  reflect  different  aspects  of  the  organization 
of human lexical memory and they all differ from synonymy. Consequently, the four
relations  must  be  represented  differently  in  WordNet.  Not  until  experience  had  been 
gained  with  this  task,  however,  was  the  extent  of  their  differences  and  interrelations 
appreciated.  In  the  final  section  of  this  paper,  we  discuss  the  role  of  these  relations  in  the 
verbal  lexicon,  which  presents  a  great  challenge  to  any  lexicographer. 

Semantic  Relations  in  the  Verbal  Lexicon 

At present, over 3,000 synonym sets of verbs have been compiled. They were initially
classified into a dozen groups along the lines suggested in Miller and Johnson-Laird
(1976). This classification follows very general but intuitively basic semantic criteria;
thus, we have verbs of possession, communication, emotion, mental state and activity,
motion, and so on. The semantic relations of hyponymy, antonymy, and meronymy, which
serve naturally to relate nouns and adjectives, turn out to be less fitting for verbs.

Superficially,  verbs  do  not  seem  to  be  easily  represented  by  a  hyponymic  taxonomy. 
Rather  than  functioning  as  true  hyponyms  of  a  superordinate  term,  clusters  of  verbs  seem 
to be related to a core or genus verb via a manner relation. Rather than bearing an ISA
relation, a verb’s relation to its genus term is expressible by means of a formula such as
to V1 is to V2 in some way. For example, to sew is to make by drawing together with a
needle and thread. However, further examination of the taxonomy of make-type verbs in
comparison with other verb classes reveals the existence of an intermediate
"superordinate" level that behaves regularly with respect to a taxonomic hierarchy: namely, where
the subordinate verbs bear an ISA relation to their superordinates (Fellbaum & Kegl,
forthcoming).

The architecture of internal verb class taxonomies is confounded by apparently
random lexical gaps at both the superordinate and the subordinate levels. Consider first the
taxonomic  organization  of  two  standardly  recognized  verb  classes:  the  creation  class  and 
the  change-of-state  class.  (See  Atkins,  Kegl,  and  Levin  (1988)  for  a  discussion  of  the 
semantic  and  syntactic  evidence  for  putting  bake  into  both  the  creation  class  and  the 
change-of-state class.)

CREATION CLASS

    genus:              [MAKE]
    superordinate:      MAKE (manner) [not lexicalized]
    basic level:        weave  sew  knit  paint  bake
    subordinate level:  machine-knit, hand-knit

CHANGE-OF-STATE CLASS

    genus:              [ ]
    superordinate:      COOK
    basic level:        broil  boil  roast  bake
    subordinate level:  stir-fry, deep-fry


Notice  that  these  two  classes  differ,  at  the  basic  level,  with  regard  to  a  transitivity 
alternation  involving  indefinite  object  deletion.  The  creation  class  permits  deletion  (see 
1),  whereas  the  change-of-state  class  does  not  (see  2). 

(1) a. John is knitting an afghan.
    b. John is knitting.

(2) a. Elaine is roasting a goose.
    b. *Elaine is roasting. [where Elaine is the agent]

The  two  classes  pattern  identically  at  the  subordinate  level,  although  above  that 
level  they  appear  to  diverge.  This  divergence  is  a  consequence  of  the  presence  or 
absence  of  lexicalization  of  the  superordinate  term.  The  change-of-state  class  allows 
indefinite  object  deletion  with  cook  but  the  creation  class  does  not  allow  the  same  option 
with  make.  Notice  that  the  change-of-state  class  has  a  lexicalized  superordinate  term, 
cook  (meaning  change  food  by  heating  in  some  manner)  but  the  creation  class  does  not. 
Although  the  creation  class  has  the  genus  term  MAKE  (with  no  interpretation  involving 
manner of making), at the superordinate level it lacks a lexical item corresponding to
"make in some manner" (*John is making). This lack of lexicalization leads us to
recognize the existence of a higher level of organization.

Indefinite object deletion is not an inherent property of the creation class and the
change-of-state class per se, but rather is linked to the fact that members of the
superordinate level of both these classes can function as activity verbs (like eat, read, dance,
clean).  In  this  activity  verb  realization  the  indefinite  object  can  be  omitted.  Notice  that  at 
the  subordinate  level  neither  the  change-of-state  nor  creation  classes  allow  indefinite 
object  deletion. 

Two  synonym  sets  on  the  subordinate  level  can  be  said  to  be  antonyms  if  the 
manner  relation  by  which  they  differ  is  antonymic.  For  example,  nibble  and  gorge  are 
antonyms because they are related to eat by little, slow and by much, fast, respectively.
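
The idea that verb antonymy is derived from an antonymic manner relation can be made concrete with a small sketch. The names ADJ_ANTONYMS, MANNER, and are_antonyms are hypothetical, and the entry for devour is an assumption added to provide a contrasting case; only nibble, gorge, and eat come from the text.

```python
# Antonymous manner descriptors, stored as unordered pairs.
ADJ_ANTONYMS = {frozenset(("little", "much")), frozenset(("slow", "fast"))}

# verb -> (genus verb, manner descriptors). "devour" is an assumed entry.
MANNER = {
    "nibble": ("eat", ("little", "slow")),
    "gorge": ("eat", ("much", "fast")),
    "devour": ("eat", ("much", "fast")),
}


def are_antonyms(v1, v2):
    """Two verbs are antonyms if they share a genus and their manners oppose."""
    genus1, manner1 = MANNER[v1]
    genus2, manner2 = MANNER[v2]
    if genus1 != genus2 or len(manner1) != len(manner2):
        return False
    return all(frozenset((a, b)) in ADJ_ANTONYMS
               for a, b in zip(manner1, manner2))
```

Under these assumptions nibble and gorge come out as antonyms, whereas gorge and devour do not, since their manner descriptors coincide rather than contrast.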

Antonymy  also  shows  up  systematically  among  verbs  denoting  a  change  from  one 
state  to  another  where  each  state  can  be  related  to  a  quality  (e.g.,  lighten  and  darken  are 
antonyms  by  virtue  of  the  antonymic  relation  that  holds  between  the  two  adjectives  from 
which  they  are  derived). 

Antonymy  in  the  verbal  lexicon  is,  for  the  most  part,  a  secondary  semantic  relation 
derived  from  adjectives  (of  manner,  degree,  or  intensity)  or  from  spatial  relations,  among 
which  it  is  a  primary  relation.  Whenever  an  antonymic  relation  cannot  be  imported  from 
elsewhere  in  the  lexicon,  we  might  expect  a  verb  pair  to  lack  an  antonymic  relation. 

Meronymy,  which  was  found  to  play  a  significant  role  as  a  semantic  relation  among 
nouns,  is  not  found  in  the  same  way  among  verbs.  Its  counterpart  in  the  verbal  lexicon 
seems to be presupposition, in that one may say of a verb to V1 presupposes to V2. For
example,  to  dream  presupposes  to  sleep.  While  the  superordinates  can  also  be  said  to  be 
presupposed  by  the  subordinates  in  their  synonym  sets,  presupposition  and  the  kind  of 
hyponymy  outlined  for  verbs  are  distinct  and  asymmetric  relations:  dream  presupposes 
sleep,  but  dreaming  is  not  a  kind  of  sleeping. 

Besides  a  manner  relation  that  is  hyponymic  in  nature,  an  antonymy  relation  that  is 
secondary  and  inherited  from  other  lexical  categories,  and  a  unidirectional  presupposition 
relation from the subordinate to the superordinate level, WordNet recognizes an
additional discriminator, which assigns words to a particular semantic domain. For example,
the  polysemous  verb  beat  is  more  readily  disambiguated  when  associated  with  different 
semantic  domains:  culinary,  musical,  contact,  competitive,  and  so  on. 

Finally,  each  synonym  set  will  be  matched  with  a  frame  specifying  the 
semantic/syntactic restrictions (a combination of subcategorizations and selectional
restrictions) of its members. WordNet is intended for use by linguistically unsophisticated
users.  Therefore,  the  codings  must  be  simple  and  straightforward,  drawing  upon  lexical 
knowledge  the  user  already  possesses.  The  coding  task  also  presents  some  interesting 
theoretical  challenges.  It  is  not  clear  at  this  point  how  many  frames  will  be  needed  to 
account for all the verbs on file, but it seems desirable to keep the number small by
giving only generic specifications: for example, NP_human V NP_non-human. On the other hand, it
is  hoped  that  the  frames  and  their  relations  to  the  synonym  sets  can  be  connected  in  some 
nonrandom fashion to the semantic relations among the verbs. Some of the semantic
distinctions made in the relational structures of possession verbs, for example, can be shown
to be reflected in a systematic way. The verbs relating to HAVE occur in the frame
NP_human V NP_non-human (John owns a car). The subordinates of take and give are
additionally specified for a prepositional phrase with NP_human and from and to,
respectively. Moreover, the frames show the difference between those give subordinates that
systematically  participate  in  the  dative  alternation  and  those  that  do  not  (NP  V  NP  NP  vs. 
NP  V  NP  to  NP). 
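
A generic frame of the kind discussed above might be represented as in the following sketch. The class Frame, its fields, the constants OWN, GIVE_TO, and TAKE_FROM, and the function dative_alternate are all hypothetical; the underscore notation for selectional restrictions and the particular restrictions assigned to give and take are assumptions made for illustration, not the coding scheme actually adopted in WordNet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A generic frame: subject restriction, object restrictions, optional PP."""
    subject: str
    objects: tuple
    preposition: str = ""
    prep_object: str = ""

    def render(self):
        parts = ["NP_" + self.subject, "V"]
        parts += ["NP_" + o for o in self.objects]
        if self.preposition:
            parts += [self.preposition, "NP_" + self.prep_object]
        return " ".join(parts)


# Assumed frames for verbs of possession and transfer.
OWN = Frame("human", ("non-human",))
GIVE_TO = Frame("human", ("non-human",), "to", "human")
TAKE_FROM = Frame("human", ("non-human",), "from", "human")


def dative_alternate(frame):
    """Return the double-object variant (NP V NP NP) of a 'to' frame."""
    if frame.preposition != "to":
        raise ValueError("only 'to' frames take part in the dative alternation")
    return Frame(frame.subject, (frame.prep_object,) + frame.objects)
```

A give subordinate that participates in the dative alternation would be matched with both GIVE_TO and dative_alternate(GIVE_TO), whereas one that does not would be matched with GIVE_TO alone.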

To summarize: significant semantic differences exist between the three major
syntactic categories (noun, adjective, and verb). Words from the three categories enter into
synonymy relations with other words, yet each category is strongly linked to one
additional predominant relation and tends to resist systematic organization by means of other
relations.